Contextual Knowledge Representation for Requirements Documents in Natural Language

Authors

  • Beum-Seuk Lee
  • Barrett R. Bryant
Abstract

In software requirements engineering there have been very few attempts to automate the translation of a requirements document written in a natural language (NL) into a formal specification language. A major reason is the ambiguity of NL requirements documentation, since NL depends heavily on context. To make a smooth transition from NL requirements to a formal specification language, we need a precise yet expressive knowledge representation that captures not only the syntactic but also the contextual information of the requirements. We propose Contextual Natural Language Processing, which uses this contextual knowledge representation to overcome the ambiguity in NL, together with Two-Level Grammar (TLG), to build a bridge between an NL requirements specification and a formal specification and thereby promote rapid prototyping and reusability of requirements documents.

Problem Statement and Prior Research

When a complex system with heavy interactions among its components is to be built, the requirements of the system are first spelled out according to the desires of the stakeholders. Several formal specification languages have been developed to describe such systems formally (Alagar & Periyasamy 1998), based on decomposition and abstraction information in the requirements. Nevertheless, natural language (NL) has remained the practical choice for domain experts specifying a system, because formal specification languages are not easy to master. In addition, the elicitation and negotiation of requirements are usually carried out in natural language. The requirements documentation written in NL therefore has to be reinterpreted into a formal specification language by software engineers.
Pohl rightly observed of this process that turning an opaque comprehension of the system into a complete system specification and transforming informal knowledge into formal representations are the major tasks in requirements engineering (Pohl 1993). When the system is very complicated, which is mostly the case when one chooses to use formal specification, this conversion is both non-trivial and error-prone, if not implausible. The challenge of formalizing the requirements document results from many factors, such as miscommunication between domain experts and engineers. The major bottleneck, however, is the inherent ambiguity of NL and the difference in the level of formalism between NL and formal specification. This is why there have been very few attempts to automate the conversion from requirements documentation to a formal specification language.

To handle the problem of ambiguity and different formalisms, some have argued that the requirements document has to be written in a particular way to reduce ambiguity (Wilson 1999). Others have proposed controlled natural languages (e.g., Attempto Controlled English (ACE) (Fuchs & Schwitter 1996)), which limit the syntax and semantics of NL to avoid the ambiguity problem. Although the former approach yields better documentation to work with, it has not accomplished any automated conversion from a natural language requirements document to a formal specification language. The latter shares our goal of automated conversion, but its restrictions on the syntax and semantics of the language sacrifice the flexibility of NL, and the user still has to remember the restrictions. Moreover, the target language of this controlled language is PROLOG, which is good for prototyping but lacks important properties, such as strong typing, needed in a formal specification language.

Another approach to natural language requirements analysis is to search each line of the requirements document for specific words and phrases for the purpose of quality analysis (Wilson, Rosenberg, & Hyatt 1996). A similar project (Girardi 1996) focuses mainly on the automatic indexing and reuse of the software components in the requirements documents.

In summary, the linguistic descriptions that these systems accept as input have been too restricted and controlled. Also, the related research so far, focusing only on the validation and verification of requirements, has not achieved a full conversion of requirements specifications into a formal specification and an implementation of those specifications.

Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Similar resources

Knowledge Representation of Requirements Documents Using Natural Language Processing

Complex systems such as automotive software systems are usually broken down into subsystems that are specified and developed in isolation and afterwards integrated to provide the functionality of the desired system. This results in a large number of requirements documents for each subsystem written by different people and in different departments. Requirements engineers are challenged by compre...

An architecture for defining features and exploring interactions

The last decade has seen an explosive growth in the development of telephony features. The description and design of new features are fraught with errors due to this growth’s impact on our ability to recognize interactions and the current practice of describing a feature’s requirements using natural language. While the use of natural language eases the communication of requirements between the ...

Creating Knowledge Repositories from Biomedical Reports: The MEDSYNDIKATE Text Mining System

MEDSYNDIKATE is a natural language processor for automatically acquiring knowledge from medical finding reports. The content of these documents is transferred to formal representation structures which constitute a corresponding text knowledge base. The system architecture integrates requirements from the analysis of single sentences, as well as those of referentially linked sentences forming co...

On "deep" knowledge extraction from documents

SYNDIKATE comprises a family of natural language understanding systems for automatically acquiring knowledge from real-world texts (e.g., information technology test reports, medical finding reports), and for transferring their content to formal representation structures which constitute a corresponding text knowledge base. We present a general system architecture which integrates requirements ...

Knowledge Representation, Sharing and Retrieval on the Web

By “knowledge retrieval”, we refer to the automatic retrieval of statements permitting a tool to make logical inferences and answer queries precisely and correctly, as opposed to retrieving documents or statements “related to” the queries. Given the ambiguity of natural language and our current inability to make computers “understand” it, the knowledge has to be manually encoded and structured ...

Publication date: 2002